Phonetic Distance Based Accent Classifier to Identify Pronunciation Variants and OOV Words
Authors
Abstract
State-of-the-art Automatic Speech Recognition (ASR) systems lack the ability to identify spoken words that have non-standard pronunciations. In this paper, we present a new classification algorithm to identify pronunciation variants. It uses the Dynamic Phone Warping (DPW) technique to compute the pronunciation-by-pronunciation phonetic distance, and a critical-distance threshold as the classification criterion. The proposed method consists of two steps: a training step that estimates the critical distance parameter from transcribed data, and a classification step that uses this critical distance to classify input utterances as pronunciation variants or out-of-vocabulary (OOV) words. The algorithm is implemented in Java. The classifier is trained on data sets from the TIMIT speech corpus and the CMU pronunciation dictionary. A confusion matrix and the precision, recall, and accuracy metrics are used for performance evaluation. Experimental results show a significant performance improvement over existing classifiers.
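The classification step described above can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class name `DpwSketch`, the method names, the unit insertion, deletion, and substitution costs, the normalization by the longer sequence, and the `<=` comparison against the critical distance are all assumptions made for illustration. The actual DPW technique uses phonetic distances between phones, which the abstract does not specify.

```java
import java.util.List;

public class DpwSketch {

    // Aligns two phoneme sequences with dynamic programming and returns a
    // length-normalized distance in [0, 1].
    // ASSUMPTION: unit costs stand in for the phone-to-phone phonetic
    // distances that a real DPW implementation would use.
    static double phoneDistance(List<String> a, List<String> b) {
        int n = a.size();
        int m = b.size();
        double[][] d = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) d[i][0] = i; // delete all of a's phones
        for (int j = 1; j <= m; j++) d[0][j] = j; // insert all of b's phones
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double sub = a.get(i - 1).equals(b.get(j - 1)) ? 0.0 : 1.0;
                d[i][j] = Math.min(d[i - 1][j - 1] + sub,
                          Math.min(d[i - 1][j] + 1.0, d[i][j - 1] + 1.0));
            }
        }
        int longest = Math.max(n, m);
        return longest == 0 ? 0.0 : d[n][m] / longest;
    }

    // Labels the input as a pronunciation variant of the reference when the
    // distance does not exceed the critical distance, otherwise as OOV.
    // ASSUMPTION: criticalDistance is a hypothetical, pre-estimated parameter.
    static String classify(List<String> input, List<String> reference,
                           double criticalDistance) {
        return phoneDistance(input, reference) <= criticalDistance
                ? "VARIANT" : "OOV";
    }
}
```

For example, with an illustrative critical distance of 0.3, the phone sequences `T AH M EY T OW` and `T AH M AA T OW` differ in one of six phones (distance 1/6), so the second would be labeled a variant of the first, whereas `P AH T EY T OW` differs in two of six phones (distance 1/3) and would be labeled OOV. In the method described above, the critical distance is estimated from transcribed training data rather than chosen by hand.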
Similar resources
Finding recurrent out-of-vocabulary words
Out-of-vocabulary (OOV) words can appear more than once in a conversation or over a period of time. Such multiple instances of the same OOV word provide valuable information for estimating the pronunciation or the part-of-speech (POS) tag of the word. But in a conventional OOV word detection system, each OOV word is recognized and treated individually. We therefore investigated how to identify ...
CrossTowns: Automatically Generated Phonetic Lexicons of Cross-lingual Pronunciation Variants of European City Names
The CrossTowns lexicons are part of a study that focuses on the phonetic variants that occur when speakers of different native languages (L1) with varying degrees of target language (L2) proficiency pronounce foreign city names. Based on a collection of speech data from this domain, it is one of the aims to identify the most common pronunciation errors in a particular L1/L2 pair (language direc...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Automatic Prediction of Intelligibility of Spoken Words in Japanese Accented English
This study examines automatic prediction of the words that will be unintelligible if they are spoken by Japanese speakers of English. In our previous study [1], 800 English utterances spoken by Japanese speakers, which contained 6,063 words, were presented to 173 American listeners and correct perception rate was obtained for each spoken word. By using the results, in this study, we define the ...
A phonetic explanation of pronunciation variant effects.
Effects of word-level phonetic variation on the recognition of words with different pronunciation variants (e.g., center produced with/(out) [t]) are investigated via the semantic- and pseudoword-priming paradigms. A bias favoring clearly articulated words with canonical variants ([nt]) is found. By reducing the bias, words with different variants show robust and equivalent lexical activation. ...